Motivated by the challenging segmentation task of pancreatic ductal networks, this paper addresses two problems commonly encountered in biomedical imaging: topological consistency of segmentations, and expensive or difficult annotation. Our contributions are the following: a) We propose a topological score measuring topological and geometric consistency between predicted and ground-truth segmentations, which we apply to model selection and validation. b) We present a full deep-learning pipeline for this difficult and noisy task on time-series image data. In our method, we first use a semi-supervised U-Net architecture, applicable to general segmentation tasks, that jointly trains an autoencoder and a segmentation network. We then use tracking of loops over time to further improve the predicted topology. This semi-supervised approach lets us exploit unannotated data to learn feature representations that generalize to data with high variability, even though our annotated training data have very limited variation. Our contributions are validated on a challenging segmentation task: locating tubular structures in the fetal pancreas from noisy live confocal microscopy imaging. We show that our semi-supervised model not only outperforms fully supervised and pre-trained models but also approaches that take topological consistency into account during training. Our method attains a mean loop score of 0.808, compared with 0.762 for a U-Net trained with clDice.
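The abstract reports a mean loop score without defining it here. As a hedged illustration of the kind of quantity involved, the sketch below counts independent loops of a skeleton graph via the first Betti number, b1 = E - V + C, and compares prediction against ground truth; the graph encoding and the agreement formula are illustrative assumptions, not the authors' metric.

```python
# Sketch: compare loop counts (first Betti number b1 = E - V + C) between
# skeleton graphs of a predicted and a ground-truth segmentation.
# Graphs are given as (num_vertices, edge_list); the scoring formula is
# an illustrative assumption, not the paper's definition.

def betti_1(num_vertices, edges):
    """Number of independent loops of an undirected graph: E - V + C."""
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
    components = sum(1 for i in range(num_vertices) if find(i) == i)
    return len(edges) - num_vertices + components

def loop_agreement(pred, gt):
    """Toy topology score in [0, 1]: 1 when loop counts match."""
    b_pred, b_gt = betti_1(*pred), betti_1(*gt)
    denom = max(b_pred, b_gt, 1)
    return 1.0 - abs(b_pred - b_gt) / denom

# A triangle (one loop) vs. a path (no loop).
triangle = (3, [(0, 1), (1, 2), (2, 0)])
path = (3, [(0, 1), (1, 2)])
print(loop_agreement(triangle, triangle))  # 1.0
print(loop_agreement(triangle, path))      # 0.0
```

In practice such a score would be computed on graphs obtained by skeletonizing the binary segmentations, which the sketch leaves out.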
Are extralinguistic signals such as image pixels crucial for inducing constituency grammars? While past work has shown substantial gains from multimodal cues, we investigate whether such gains persist in the presence of rich information from large language models (LLMs). We find that our approach, LLM-based C-PCFG (LC-PCFG), outperforms previous multimodal methods on the task of unsupervised constituency parsing, achieving state-of-the-art performance on a variety of datasets. Moreover, LC-PCFG yields an over 50% reduction in parameter count, and training-time speedups of 1.7x over image-aided models and more than 5x over video-aided models. These results challenge the notion that extralinguistic signals such as image pixels are needed for unsupervised grammar induction, and point to the need for stronger text-only baselines when evaluating the benefit of multimodality for the task.
Diffusion models have achieved great success in modeling continuous data modalities such as images, audio, and video, but have seen limited use in discrete domains such as language. Recent attempts to adapt diffusion to language have presented diffusion as an alternative to autoregressive language generation. We instead view diffusion as a complementary method that can augment the generative capabilities of existing pre-trained language models. We demonstrate that continuous diffusion models can be learned in the latent space of a pre-trained encoder-decoder model, enabling us to sample continuous latent representations that can be decoded into natural language with the pre-trained decoder. We show that our latent diffusion models are more effective at sampling novel text from data distributions than a strong autoregressive baseline and also enable controllable generation.
Artificial Intelligence (AI) is one of the most transformative technologies of the 21st century. The extent and scope of future AI capabilities remain a key uncertainty, with widespread disagreement on timelines and potential impacts. As nations and technology companies race toward greater complexity and autonomy in AI systems, there are concerns over the extent of integration and oversight of opaque AI decision processes. This is especially true in the subfield of machine learning (ML), where systems learn to optimize objectives without human assistance. Objectives can be imperfectly specified or executed in an unexpected or potentially harmful way. This becomes more concerning as systems increase in power and autonomy, where an abrupt capability jump could result in unexpected shifts in power dynamics or even catastrophic failures. This study presents a hierarchical complex systems framework to model AI risk and provide a template for alternative futures analysis. Survey data were collected from domain experts in the public and private sectors to classify AI impact and likelihood. The results show increased uncertainty over the powerful AI agent scenario, confidence in multiagent environments, and increased concern over AI alignment failures and influence-seeking behavior.
Denoising diffusion probabilistic models and score matching models have proven to be very powerful for generative tasks. While these approaches have also been applied to the generation of discrete graphs, they have, so far, relied on continuous Gaussian perturbations. Instead, in this work, we suggest using discrete noise for the forward Markov process. This ensures that the graph remains discrete at every intermediate step. Compared to the previous approach, our experimental results on four datasets and multiple architectures show that a discrete noising process yields higher-quality generated samples, as indicated by average MMDs reduced by a factor of 1.5. Furthermore, the number of denoising steps is reduced from 1000 to 32, leading to a roughly 30 times faster sampling procedure.
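A minimal sketch of the discrete forward process the abstract describes, assuming an undirected graph with independent edge flips as the transition kernel (the actual transition matrices and noise schedule are not specified here):

```python
# Sketch of a discrete forward Markov step for graph diffusion: each
# potential edge of an undirected graph is independently resampled, so every
# intermediate graph stays discrete (contrast with Gaussian perturbations).
# The flip probability and schedule are illustrative assumptions.
import random

def discrete_noise_step(adj, flip_prob, rng):
    """One forward step: flip each upper-triangular entry with flip_prob."""
    n = len(adj)
    noisy = [row[:] for row in adj]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < flip_prob:
                noisy[i][j] = noisy[j][i] = 1 - noisy[i][j]
    return noisy

rng = random.Random(0)
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
for t in range(4):                # a few forward steps
    adj = discrete_noise_step(adj, flip_prob=0.2, rng=rng)
print(adj)                        # still a binary, symmetric adjacency matrix
```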
Autonomous vehicles must be able to reliably handle adverse weather conditions (e.g., snow) to operate safely. In this paper, we investigate the idea of restyling sensor inputs (i.e., images) captured in adverse conditions so that downstream tasks (e.g., semantic segmentation) can achieve high accuracy. Prior work primarily formulates this as an unpaired image-to-image translation problem, because paired images captured under exactly the same camera pose and semantic layout are lacking. While perfectly aligned images are unavailable, coarsely paired images can be obtained easily. For example, many people drive the same routes daily in both good and adverse weather; thus, images captured at nearby GPS locations can form a pair. Although data from repeated traversals is unlikely to capture the same foreground objects, we argue that it provides rich contextual information to supervise the image-translation model. To this end, we propose a novel training objective that leverages coarsely paired images. We show that our coarsely aligned training scheme yields better image-translation quality and improves downstream tasks, such as semantic segmentation, monocular depth estimation, and visual localization.
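The coarse pairing by GPS proximity can be sketched as a nearest-neighbor match on recorded coordinates; the image identifiers, distance threshold, and haversine matching below are illustrative assumptions, not the paper's procedure:

```python
# Sketch: forming coarse image pairs by GPS proximity. Each image is
# (id, lat, lon); for every adverse-weather image we pick the nearest
# clear-weather image within a distance threshold.
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def coarse_pairs(adverse, clear, max_dist_m=20.0):
    """Match each adverse-weather image to its nearest clear-weather image."""
    pairs = []
    for aid, alat, alon in adverse:
        best = min(clear, key=lambda c: haversine_m(alat, alon, c[1], c[2]))
        if haversine_m(alat, alon, best[1], best[2]) <= max_dist_m:
            pairs.append((aid, best[0]))
    return pairs

snowy = [("snow_001", 42.4440, -76.5019)]
sunny = [("sun_103", 42.4441, -76.5018), ("sun_200", 42.5000, -76.4000)]
print(coarse_pairs(snowy, sunny))  # [('snow_001', 'sun_103')]
```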
We propose, for the first time, a convolution-free transformer model based on multiple instance learning, called the Multiple Instance NeuroImage Transformer (MINiT), for classifying T1-weighted (T1w) MRIs. We first present several transformer models adapted for neuroimages. These models extract non-overlapping 3D blocks from the input volume and perform multi-head self-attention on their linear projections. MINiT, on the other hand, treats each non-overlapping 3D block of the input MRI as its own instance, splits it further into non-overlapping 3D patches, and computes multi-head self-attention over them. As a proof of concept, we evaluate the efficacy of the model by training it to identify sex from the T1w MRIs of two public datasets: Adolescent Brain Cognitive Development (ABCD) and the National Consortium on Alcohol and NeuroDevelopment in Adolescence (NCANDA). The learned attention maps highlight voxels that contribute to identifying sex differences in brain morphometry. The code is available at https://github.com/singlaayush/minit.
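The block extraction both the baseline transformers and MINiT start from can be sketched as splitting a volume into non-overlapping 3D blocks; the nested-list volume representation below is an illustrative simplification:

```python
# Sketch: splitting a 3D volume into non-overlapping blocks, the first step
# shared by the baseline transformers and MINiT (which then further splits
# each block into patches). The volume is a nested list for
# self-containment; the block size is assumed to divide each dimension.

def extract_blocks(volume, block):
    """Return a list of block-sized sub-volumes in raster order."""
    D, H, W = len(volume), len(volume[0]), len(volume[0][0])
    b = block
    assert D % b == 0 and H % b == 0 and W % b == 0
    blocks = []
    for z in range(0, D, b):
        for y in range(0, H, b):
            for x in range(0, W, b):
                blocks.append([[row[x:x + b]
                                for row in volume[z + dz][y:y + b]]
                               for dz in range(b)])
    return blocks

# A 4x4x4 volume of voxel indices, split into 2x2x2 blocks -> 8 blocks.
vol = [[[z * 16 + y * 4 + x for x in range(4)] for y in range(4)]
       for z in range(4)]
blocks = extract_blocks(vol, 2)
print(len(blocks))        # 8
print(blocks[0][0][0])    # [0, 1]
```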
Advances in perception for self-driving cars have accelerated in recent years owing to the availability of large-scale datasets, which are typically collected at specific locations and under good weather conditions. Yet, to achieve the high safety requirements, these perception systems must operate robustly in varied weather conditions including snow and rain. In this paper, we present a new dataset to enable robust autonomous driving via a novel data collection process: data is repeatedly recorded along a 15 km route under diverse scene (urban, highway, rural, campus), weather (snow, rain, sun), time (day/night), and traffic conditions (pedestrians, cyclists, and cars). The dataset includes images and point clouds from cameras and LiDAR sensors, along with high-precision GPS/INS to establish correspondences across routes. The dataset includes road and object annotations using amodal masks to capture partial occlusions, as well as 3D bounding boxes. We demonstrate the uniqueness of this dataset by analyzing the performance of baselines in road and object segmentation, depth estimation, and 3D object detection. The repeated routes open new research directions in object discovery, continual learning, and anomaly detection. Link to Ithaca365: https://ithaca365.mae.cornell.edu/
A fundamental approach in neuroscience research is to test hypotheses based on neuropsychological and behavioral measures, i.e., whether certain factors (e.g., ones related to life events) are associated with an outcome (e.g., depression). In recent years, deep learning has become a potential alternative for conducting such analyses by predicting the outcome from a collection of factors and identifying the most "informative" ones driving the prediction. However, this approach has had limited impact because its findings are not linked to the statistical significance of the factors supporting the hypotheses. In this paper, we propose a flexible and scalable approach based on the concept of permutation testing that integrates hypothesis testing into a data-driven deep learning analysis. We apply our approach to the yearly self-reported assessments of 621 adolescent participants of the National Consortium on Alcohol and NeuroDevelopment in Adolescence (NCANDA) to predict negative valence, a symptom of major depressive disorder according to the NIMH Research Domain Criteria (RDoC). Our method successfully identifies categories of risk factors that further explain the symptom.
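The permutation-testing idea the method builds on can be sketched on a toy factor/outcome example; the mean-difference statistic and the data below are illustrative stand-ins for the deep-learning analysis:

```python
# Sketch of permutation testing: estimate a p-value for a factor by
# comparing its observed association statistic with the same statistic
# under random permutations of the outcome labels. The mean-difference
# statistic and toy data are illustrative assumptions.
import random

def mean(xs):
    return sum(xs) / len(xs)

def mean_diff(values, labels):
    """Difference in mean factor value between outcome groups 1 and 0."""
    g1 = [v for v, l in zip(values, labels) if l == 1]
    g0 = [v for v, l in zip(values, labels) if l == 0]
    return mean(g1) - mean(g0)

def permutation_p_value(values, labels, n_perm=10000, seed=0):
    rng = random.Random(seed)
    observed = abs(mean_diff(values, labels))
    hits = 0
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        if abs(mean_diff(values, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)   # add-one smoothing

values = [0.9, 1.1, 1.0, 3.0, 3.2, 2.9]  # factor clearly separates groups
labels = [0, 0, 0, 1, 1, 1]
# prints roughly 0.1, the smallest two-sided p attainable with a 3-vs-3 split
print(permutation_p_value(values, labels))
```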
Translating machine learning algorithms into clinical applications requires addressing challenges related to interpretability, such as accounting for the effect of confounding variables (or metadata). Confounding variables affect the relationship between input training data and target outputs. When we train a model on such data, confounding variables bias the distribution of the learned features. A recent promising solution, Metadata Normalization (MDN), estimates the linear relationship between the metadata and each feature based on a non-trainable closed-form solution. However, this estimation is limited by the sample size of a mini-batch and may therefore render the approach unstable during training. In this paper, we extend the MDN method by applying a penalty approach (referred to as PMDN). We cast the problem into a bi-level nested optimization problem. We then approximate this optimization problem using a penalty method so that the linear parameters within the MDN layer are trainable and learned on all samples. This allows PMDN to be plugged into any architecture, even those unable to run batch-level operations, such as transformers and recurrent models. We show improvements in model accuracy and greater independence from confounders using PMDN over MDN in a synthetic experiment and on a multi-label, multi-site dataset of magnetic resonance images (MRIs).
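The closed-form linear correction that MDN applies per feature can be sketched for a single confound as least-squares residualization; PMDN's contribution is making such coefficients trainable across all samples, which this illustrative snippet does not implement:

```python
# Sketch of the closed-form per-feature correction in MDN: regress the
# feature on the metadata (here a single confound) and keep the residual.
# The 1-D least-squares below is an illustrative simplification of the
# general closed-form solution over a metadata matrix.

def residualize(feature, confound):
    """Remove the least-squares linear effect of confound from feature."""
    n = len(feature)
    mf = sum(feature) / n
    mc = sum(confound) / n
    cov = sum((c - mc) * (f - mf) for c, f in zip(confound, feature))
    var = sum((c - mc) ** 2 for c in confound)
    beta = cov / var
    return [f - beta * (c - mc) - mf for f, c in zip(feature, confound)]

confound = [1.0, 2.0, 3.0, 4.0]
feature = [2.0 * c + 0.5 for c in confound]  # fully explained by confound
print(residualize(feature, confound))        # residuals all ~0
```

Estimating `beta` from a small mini-batch is exactly where the instability the abstract mentions arises, which motivates learning the coefficients over all samples instead.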